1. INTRODUCTION

To obtain quality software, its runtime behavior is tested to the maximum possible extent [1]. The
software industry loses about $500 billion per year because of poor software quality or software
failure [2]. Software failures are caused by faults, and those faults can be detected by software testing.
Testing is required to obtain high quality software that satisfies the user specifications and
requirements [3]. Software testing is the process of finding and resolving errors so that software
quality can be improved. Errors are identified by executing the code with a set of test inputs called
test data or test cases [4-5], where a test case is a triplet [I, S, O]: I is the input data to the system,
S is the state of the system at which the data is input, and O is the expected output [6-7]. Test case
generation and test case execution require a lot of effort. Exhaustive testing is not always possible,
because there is no limit on the test data that can be generated, whereas the cost and time of the
testing process are limited [8]. When carried out manually, testing is time consuming, less reliable,
incomplete in coverage and risky; it suffers from drawbacks such as low operation speed, high
investment of cost and time, limited availability of resources, redundancy of test cases, and inefficient
and inaccurate test checking [9]. These drawbacks can be overcome by automated testing, which reduces
the cost and time of the testing process; this reduction is its most important benefit. Consequently,
automated software testing and the development of high quality test cases have become two main
objectives of the software industry [9-10]. Software testing can be broadly divided into two approaches:
random based testing and search based testing [11].

a. Random Based Software Testing (RBST)

b. Search Based Software Testing (SBST)

Random generation is the simplest way to produce test data, but the probability of satisfying the
constraints of the tested program is very low. It simply executes the program with random inputs and
checks whether the expected output is produced. A major problem of RBST is that sometimes none of the
generated test data reaches the target test data, often called critical data [12]. Search based
approaches, in contrast, have been widely used in recent years to solve optimization problems in the
field of Search Based Software Testing (SBST). In a search based technique, the target criterion is
converted into an optimization problem, so that Evolutionary Algorithms (EAs) such as GA, PSO, ACO and
ABCO can be used to solve it by providing a global optimum or a near-optimal solution [13-14].
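
The weakness of random generation can be illustrated with a minimal Python sketch. The function `classify` is a hypothetical program under test (it is not taken from the reviewed papers) whose target branch `a == b` is reached only when both random inputs coincide; `random_testing` and `hit_probability` are likewise illustrative names.

```python
import random


def classify(a: int, b: int) -> str:
    """Hypothetical program under test with one hard-to-reach branch."""
    if a == b:  # target branch: taken only when both inputs coincide
        return "equal"
    return "greater" if a > b else "smaller"


def random_testing(trials: int, lo: int, hi: int, seed: int = 0) -> int:
    """Run the program on random inputs and count hits of the target branch."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        a, b = rng.randint(lo, hi), rng.randint(lo, hi)
        if classify(a, b) == "equal":
            hits += 1
    return hits


def hit_probability(lo: int, hi: int) -> float:
    """Exact chance that one uniformly random pair satisfies a == b."""
    return 1.0 / (hi - lo + 1)
```

For inputs drawn uniformly from 1 to 1000, a single random test reaches the target branch with probability 1/1000, which is why a guided search is usually preferred for such branches.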

This paper presents a systematic review of test data generation for path testing using different EAs.
The rest of the paper is organized as follows: Section 2 describes the basic concepts used in this
paper. Section 3 discusses related work on path coverage based testing using different EAs. Section 4
concludes the paper and outlines some future work.


2. BASIC CONCEPTS 
In this section, a few background concepts and definitions that are used throughout the research
work are discussed.


2.1. Path Testing 

Testing is carried out at four different levels, viz. unit testing, integration testing, system testing
and acceptance testing. Among these, unit testing is the base of all other types of testing [15]. It is
done in the coding stage, and if it is not done properly, the cost and time of the other testing levels
will increase. Unit testing therefore plays an important role in maintaining software quality [14].
There are two different ways to generate test cases for unit level testing [16-18]:

a. White box approach (glass box or structural approach).

b. Black box approach (functional approach).

Structural testing mainly involves testing a unit or module and is very important for the software
developer. To test a unit or a module, different coverage based testing techniques are used, such as
statement coverage, condition coverage, multiple condition coverage and path coverage [17], [19].
Among all coverage based techniques, path coverage is the strongest criterion, as it can detect about
65% of the defects present in a Software Under Test (SUT). According to the literature, many studies
have addressed unit testing, but comparatively little attention has been paid to path testing [19-20].

Path testing was first introduced by Howden in 1976. It allows logical errors to be found, since errors
or faults associated with different numbers of iterations are exposed in different paths. Such logical
errors may not be detected by branch or statement coverage based testing [18]. In path testing, test
cases are designed in such a way that every linearly independent path of the SUT is executed at least
once. The linearly independent paths can be obtained from the Control Flow Graph (CFG) of a program,
which shows the flow of control in the program [20-21]. McCabe's Cyclomatic Complexity (CC) gives the
upper bound on the number of linearly independent paths present in a program. The CC of a program
can be found by using (1).


$V(G) = E - N + 2$ (1)

where $E$ is the number of edges and $N$ is the number of nodes of the CFG.


An example is shown in Figure 1, which illustrates the steps carried out to find the linearly
independent paths of a specific program. With path testing, test cases are created and executed for
all possible paths, which results in 100% statement coverage and 100% branch coverage [21-22].


2.2. Critical Path 

During path testing, a path for which no test data has been generated, and for which the search for
covering test data can never succeed, is called a critical path. In such cases, a criterion is needed to
stop the search for test data that covers the critical path once it is almost certain that further
searching is worthless [23-26].

Program to find the GCD of two numbers [3]

int GCD (int a, int b)
{
1. while (a != b)
2. { if (a > b)
3.       a = a - b;
4.   else b = b - a;
5. }
6. return a;
}

Figure 1. CFG for GCD program


The linearly independent paths are as follows:

1-6
1-2-3-5-6
1-2-4-5-6

2.3. Evolutionary Algorithm (EA) 

Evolutionary algorithms (EAs) are based on the biological behavior or evolution of populations [27-28].
They follow the principle of survival of the fittest and model natural phenomena such as genetic
inheritance and the Darwinian strife for survival; they constitute an interesting category of modern
heuristic search [29-30]. Soft computing techniques such as Genetic Algorithm (GA), Particle Swarm
Optimization (PSO), Ant Colony Optimization (ACO) and Artificial Bee Colony Optimization (ABCO) have
been developed to greatly reduce the time and computational cost needed to achieve maximum
benefit [30-31].


2.3.1 Genetic Algorithm (GA) 

Genetic Algorithm (GA) is an evolutionary algorithm, which was developed by John Holland in 
1975 [27-28]. GA has emerged as a practical, robust optimization technique and search method. It is inspired 
by the way nature evolves species using natural selection of the fittest individuals [29]. 

While solving a problem, GA works on a set of feasible solutions, called the search space, in which
each individual solution is marked by its fitness value or score with respect to the problem. The
fitness value of an individual is determined through a predefined fitness function. This value
expresses how well the individual solves the given problem and drives the decision about which
solutions are kept and which are discarded for the next generation. GA always aims for the optimal
solution, i.e. an extreme value such as a minimum or maximum in the search space [30-31]. The
algorithm searches for the global optimum by employing its operators, such as selection, crossover,
mutation and elitism [32-34].

A GA can be expressed as an 8-tuple, as defined in (2) [25], [35-36].

$GA = (C_o, F, P_o, N, S, C, M, T)$ (2)

where $C_o$ = individual coding method, $F$ = individual fitness evaluation, $P_o$ = initial population,
$N$ = population size, $S$ = selection operator, $C$ = crossover operator, $M$ = mutation operator,
and $T$ = termination condition of the algorithm.
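
The components of the 8-tuple can be made concrete with a minimal Python sketch. Everything in it is an assumption chosen for illustration rather than a method from the cited papers: the individual is a pair of integers, the fitness is the branch distance `abs(a - b)` to a hypothetical target branch `a == b`, selection is a binary tournament, and the names `branch_distance` and `run_ga` are made up.

```python
import random


def branch_distance(individual):
    """F: fitness to minimize; 0 means the target branch a == b is covered."""
    a, b = individual
    return abs(a - b)


def run_ga(pop_size=20, lo=1, hi=1000, max_gen=200, p_mut=0.2, seed=1):
    rng = random.Random(seed)
    # Co: an individual is a pair of integers (a, b).
    # P0, N: random initial population of pop_size individuals.
    pop = [(rng.randint(lo, hi), rng.randint(lo, hi)) for _ in range(pop_size)]
    history = []  # best fitness seen in each generation
    for _ in range(max_gen):  # T: stop after max_gen generations ...
        pop.sort(key=branch_distance)
        history.append(branch_distance(pop[0]))
        if history[-1] == 0:  # ... or as soon as the target is covered
            break
        nxt = [pop[0]]  # elitism: the best individual always survives
        while len(nxt) < pop_size:
            # S: binary tournament selection of two parents.
            p1 = min(rng.sample(pop, 2), key=branch_distance)
            p2 = min(rng.sample(pop, 2), key=branch_distance)
            child = [p1[0], p2[1]]  # C: one-point crossover
            if rng.random() < p_mut:  # M: small random step on one gene
                i = rng.randrange(2)
                child[i] = min(hi, max(lo, child[i] + rng.randint(-10, 10)))
            nxt.append(tuple(child))
        pop = nxt
    best = min(pop, key=branch_distance)
    return best, history
```

Calling `run_ga()` returns the best individual found and the best fitness per generation. Because of elitism, that sequence never increases, although reaching a distance of 0 within the generation budget is not guaranteed.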


2.3.2 Particle Swarm Optimization (PSO) 

PSO is an evolutionary computation technique developed by Kennedy and Eberhart in 1995 that models
the social behavior of bird flocking or fish schooling. It begins with a group of randomly generated
individuals called the initial population. The best solution is found by a number of particles
constituting a swarm, which move around in a real-valued N-dimensional search space and adjust their
flight according to their own and the other particles' flying experience [37]. A fitness function is
defined to evaluate each particle of the population. Each particle is assigned coordinates in the form
of a location and a velocity, and it keeps track of the best solution it has found so far (pbest) as well
as the overall best solution found by any particle in the population (gbest). The particles are called
potential solutions. At each step, the velocity of each particle is changed so that it accelerates
towards its pbest and gbest locations. The acceleration is weighted by random terms, and the previous
velocity is weighted by w, called the inertia weight. The velocity and the position of each particle are
updated by using (3) and (4), so that a new population is generated in every step. PSO has a premature
convergence problem, i.e. it may converge to a local best solution [38-39].

Indonesian J Elec Eng & Comp Sci, Vol. 15, No. 1, July 2019 : 504 - 510


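$v_i^{(t+1)} = w v_i^{t} + c_1 r_1 (p_i - x_i^{t}) + c_2 r_2 (g - x_i^{t})$ (3)

$x_i^{(t+1)} = x_i^{t} + v_i^{(t+1)}$ (4)

where $x_i^{t}$ and $v_i^{t}$ are the position and velocity of particle $i$ at step $t$, $p_i$ is its
pbest, $g$ is the gbest, $c_1$ and $c_2$ are acceleration coefficients, and $r_1$ and $r_2$ are random
numbers in [0, 1].
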
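
The velocity and position updates of (3) and (4) can be sketched in Python as follows, assuming the standard PSO update rule with inertia weight `w`, acceleration coefficients `c1` and `c2`, and fresh random factors per dimension. The function name `pso_step` and the default parameter values are illustrative assumptions, not values reported in the reviewed papers.

```python
import random


def pso_step(x, v, pbest, gbest, w=0.7, c1=1.5, c2=1.5, rng=random):
    """Apply one velocity and position update to a single particle."""
    new_x, new_v = [], []
    for xi, vi, pi, gi in zip(x, v, pbest, gbest):
        r1, r2 = rng.random(), rng.random()  # random factors in [0, 1)
        # Velocity: inertia term + pull towards pbest + pull towards gbest.
        vel = w * vi + c1 * r1 * (pi - xi) + c2 * r2 * (gi - xi)
        new_v.append(vel)
        new_x.append(xi + vel)  # position: move by the new velocity
    return new_x, new_v
```

A particle that already sits on both pbest and gbest with zero velocity does not move, which hints at how a swarm can stall at a local best solution once all particles have collapsed onto the same point.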
2.3.3 Ant Colony Optimization (ACO) 

ACO is a distributed meta-heuristic algorithm, inspired by the biological behavior of real-world ants,
that is mainly used to solve optimization problems. The Ant System algorithm was first proposed by
Marco Dorigo et al. in 1991 to solve combinatorial discrete optimization problems [40]. In this
algorithm, the optimization problem is represented as a graph, and artificial ants repeatedly move along
the paths of the graph to find the best solution. While searching for food, the ants leave a chemical
called pheromone on the randomly travelled paths so that they can coordinate with each other. Ants
select their paths according to the higher pheromone levels of the graph edges. After some traversals,
the pheromone level of the shortest path becomes higher than that of the others, because more
pheromone evaporates on longer paths. In each iteration, possible solutions are created, and finally
the best quality solution is evaluated by using a heuristic measure [41-42].
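
The path selection and pheromone update described above can be sketched in Python. This is a minimal illustration under stated assumptions: the graph is a small acyclic example invented here, every ant deposits `q / path_length` on the edges it used, and all edges evaporate at the same rate `rho`. The names `choose_next`, `walk` and `update_pheromone` are hypothetical.

```python
import random


def choose_next(node, graph, pheromone, rng):
    """Pick a successor with probability proportional to the edge pheromone."""
    successors = graph[node]
    weights = [pheromone[(node, s)] for s in successors]
    return rng.choices(successors, weights=weights, k=1)[0]


def walk(graph, pheromone, start, goal, rng):
    """Let one ant walk from start to goal (the graph is assumed acyclic)."""
    path = [start]
    while path[-1] != goal:
        path.append(choose_next(path[-1], graph, pheromone, rng))
    return path


def update_pheromone(pheromone, paths, rho=0.5, q=1.0):
    """Evaporate all edges, then let each ant deposit q / (number of edges)."""
    for edge in pheromone:
        pheromone[edge] *= 1.0 - rho
    for path in paths:
        deposit = q / (len(path) - 1)  # shorter paths receive more per edge
        for edge in zip(path, path[1:]):
            pheromone[edge] += deposit


# Illustrative graph: A-B-D is shorter than A-C-E-D.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["E"], "E": ["D"], "D": []}
pheromone = {(u, v): 1.0 for u in graph for v in graph[u]}
rng = random.Random(0)
for _ in range(20):
    ants = [walk(graph, pheromone, "A", "D", rng) for _ in range(5)]
    update_pheromone(pheromone, ants)
```

With a per-edge deposit that shrinks as the path grows, edges on the shorter route tend to accumulate more pheromone over the iterations, so later ants are more likely to follow them.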


2.3.4 Artificial Bee Colony Optimization (ABCO) 

ABCO is a population based process in which the independent and parallel behavior of scout bees,
employed bees and onlooker bees finds the global optimum solution faster. It is a non-pheromone based
approach, so no pheromone update is needed [43]. In this process, computational overhead and memory
limit problems are balanced. Some dedicated scout bees are appointed to explore flower patches in the
surrounding environment at random. The fitness value of a particular flower patch is defined by taking
the nectar amount, the distance and the direction of the flower patch from the bee hive. Scout bees
give this information to the onlooker bees in the form of a waggle dance, and the onlooker bees then
determine the fitness value and the probability value of the food source. The food source with
maximum profitability is selected for exploration [44].
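
The interplay of the three bee roles can be sketched in Python. The code follows the commonly used basic ABC scheme (employed bees, fitness-proportional onlookers, scouts that replace exhausted sources) as a simplification, not the exact procedure of the cited works. The function name `abc_minimize`, the fitness mapping `1 / (1 + cost)` and all parameter defaults are assumptions, and the cost function is assumed to be non-negative.

```python
import random


def abc_minimize(cost, dim, lo, hi, n_sources=10, limit=5, cycles=50, seed=0):
    """Minimize a non-negative cost over [lo, hi]^dim with a basic bee colony."""
    rng = random.Random(seed)

    def new_source():
        return [rng.uniform(lo, hi) for _ in range(dim)]

    sources = [new_source() for _ in range(n_sources)]
    trials = [0] * n_sources  # failed improvement attempts per food source
    best = list(min(sources, key=cost))

    def try_neighbor(i):
        # Shift one coordinate of source i relative to another random source.
        k = rng.choice([j for j in range(n_sources) if j != i])
        d = rng.randrange(dim)
        cand = list(sources[i])
        cand[d] += rng.uniform(-1.0, 1.0) * (cand[d] - sources[k][d])
        cand[d] = min(hi, max(lo, cand[d]))
        if cost(cand) < cost(sources[i]):  # greedy replacement
            sources[i], trials[i] = cand, 0
        else:
            trials[i] += 1

    for _ in range(cycles):
        for i in range(n_sources):  # employed bees: one per food source
            try_neighbor(i)
        fitness = [1.0 / (1.0 + cost(s)) for s in sources]
        for _ in range(n_sources):  # onlookers: fitness-proportional choice
            i = rng.choices(list(range(n_sources)), weights=fitness, k=1)[0]
            try_neighbor(i)
        best = list(min(sources + [best], key=cost))  # remember the best
        for i in range(n_sources):  # scouts: abandon exhausted sources
            if trials[i] > limit:
                sources[i], trials[i] = new_source(), 0
    return best
```

For test data generation, `cost` would typically be a distance between the executed path and the target path. Here it is left as a parameter, so any non-negative function, such as a sum of squares, can be plugged in.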


3. RELATED WORK 

EAs are frequently used for path coverage based test data generation and optimization to achieve
maximum path coverage. In this section, a few related research works on software path testing using
variants of EAs are discussed.


3.1. Test Case Generation and Optimization using GA 

Hermadi et al. [19] developed a GA based approach to cover multiple paths at a time. They applied
different fitness functions to several benchmark problems. The fitness was designed by combining path
traversal, neighborhood influence and normalization. They found that the new GA based multiple path
test data generator gives better results than the previous method. Cao et al. [13] developed an approach
to generate test data that covers only one specific path of a SUT. GA is used to increase the path
coverage of the SUT in order to achieve better quality and reliability, and the fitness is designed using
the Overlapped Path Similarity (OPS) between the executed path and the target path. The most popular
program, TCP, was used in their experiments, and they found that GA based OPS can generate a large
number of valid test data in less time. Garg et al. [26] proposed a new fitness function named Extended
Level Branch (EXLB) for basic path testing using a simple GA with a hill climbing method and a selection
operator, but the proposed method could not cover all paths. Zhu et al. [34] (2017) proposed an improved
GA to balance the load of each calculation resource over the target paths of a SUT. A grouping strategy
for target paths is defined using the common constraints of the target paths, in order to reduce the
search space of test data. A symbolic execution tool is used along with GA to accelerate the convergence
of the search process, which improves the efficiency of SBST. The proposed approach was applied to four
different programs: bubble sort, insertion sort, selection sort and shell sort. The experimental results
show very good performance in terms of both efficiency and load balancing of the generated test data.


3.2. Test Case Generation and Optimization using PSO 

Latiu et al. [37] used three different evolutionary algorithms, GA, PSO and SA (Simulated Annealing),
to generate test data automatically for path testing and found that evolutionary testing strategies are
well suited to generating test data for a software program. Huang et al. [38] proposed a method called
SAF-GPSO (Swarm Activity Feedback-Gauss Particle Swarm Optimization), based on an improved PSO, for
multipath test case generation and found that their method gives better results than GA and PSO.
Han et al. [21] proposed a modified multiple path test data generator using PSO. The authors used
several benchmark problems and found that their approach is more effective and efficient for complicated
and large path sets. They used different population sizes, and the experimental results show that the
average number of iterations needed to cover all feasible paths decreases as the population size
increases.


3.3. Test Case Generation and Optimization using ACO

Biswas et al. [40] proposed an ACO based approach that guarantees full software coverage with
minimum redundancy. The approach can generate optimal paths in a prioritized order and also generates
test data sequences within the domain to use as inputs for the generated paths. Mann et al. [42]
proposed an ACO based path prioritization to generate test data with maximum path coverage. The
algorithm, named PP-ACO, is used to generate an optimal path sequence in the DD graph of a SUT. The
most popular search based program, TCP, was used for the experiment, and the reported results show
that the proposed technique can generate test data for full path coverage and prioritize those test
data according to path strength.


3.4. Test Case Generation and Optimization using ABC

Lam et al. [43] presented an approach for the automatic generation of feasible independent test paths
using the edge coverage criterion. They used the ABC optimization technique to optimize the test suite
and showed the efficiency of their method by comparing it with previous related approaches, but the
approach could not eliminate duplicate test data from the final test suite. Khari et al. [44] developed
an automated testing tool for test suite generation and optimization using ABC. Their method is able to
provide a minimal set of test cases with maximum path coverage. The authors compared their results
with CSA (Cuckoo Search Algorithm) and found that the ABC based method offers better results than CSA
in terms of path coverage. The reported results show that ABC achieves 90.3% path coverage, whereas
CSA achieves only 75.4%.


4. CONCLUSION 

This paper briefly reviewed related research work on path coverage based testing, one of the white
box testing techniques, using different EAs, viz. GA, PSO, ACO and ABCO. In path testing, test data is
generated to cover the basic paths of a specific SUT. It is observed that different EAs are frequently
used for test case generation and optimization to cover the basic paths. However, as we move towards
test suites with higher path coverage for more complex software, more efficient methods are needed.
Researchers have employed many different approaches to achieve maximum coverage. However, it is very
difficult to achieve 100% path coverage for software that is complex in terms of LOC, and a large number
of test data is required to approach the maximum. As a result, numerous algorithms have been proposed,
implemented and applied over the past decades to achieve the highest coverage. The presence of critical
paths is also one of the main issues in achieving full path coverage, so detecting critical paths and
generating test data for them is a very challenging issue in path testing.

In future, it is planned to develop an efficient EA based algorithm that generates an optimized test
suite satisfying maximum path coverage for any SUT, and simultaneously to validate the effectiveness
and efficiency of the proposed algorithm in covering the most critical paths. It is also planned to
design a real coded GA to generate and optimize test data with maximum path coverage and a minimum
test data generation count.